# SCT212-0056/2020 FAVOUR PAUL MUTURI LAB 2

(IF, ID, EX, MEM, WB) and the loop code provided:

### **Code Fragment:**

### Initial Setup:

- R3 = R2 + 396
- Memory accesses take 1 cycle
- 5-stage classic RISC pipeline: IF-ID-EX-MEM-WB
- Branch resolved in ID stage
- Separate instruction/data memory (no structural hazard)

## a) No forwarding or bypassing

#### Notes:

- Data hazards must be resolved by inserting stalls (NOPs).
- The branch is handled by flushing the pipeline after BNEZ (the 2 instructions fetched after it are flushed).

#### Hazards:

1. LD → DADDI: RAW on R1 (needs 2 NOPs)
2. DADDI → SD: RAW on R1 (needs 2 NOPs)
3. DSUB → BNEZ: RAW on R4 (needs 2 NOPs)
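The 2-NOP figure for back-to-back dependent instructions can be reproduced with a small Python sketch. It assumes the usual convention that the register file is written in the first half of WB and read in the second half of ID; the function name `stalls_without_forwarding` and the `distance` parameter (how many instruction slots separate producer and consumer) are illustrative, not part of the exercise.

```python
def stalls_without_forwarding(distance: int) -> int:
    """Stall cycles needed for a RAW hazard when there is no forwarding.

    Assumption: the producer writes the register in WB (4 cycles after its
    own IF) and the consumer can read it in ID during that same cycle.
    A consumer issued `distance` slots later reaches ID `distance + 1`
    cycles after the producer's IF, so it must wait for the difference.
    """
    if distance < 1:
        raise ValueError("distance must be at least 1")
    return max(0, 3 - distance)


# Adjacent instructions (distance 1), as in the three hazards listed above.
print(stalls_without_forwarding(1))  # 2
print(stalls_without_forwarding(2))  # 1
print(stalls_without_forwarding(3))  # 0
```

Under this model, any independent instruction placed between producer and consumer removes one stall cycle.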

### Pipeline (one loop iteration):

| Cycle | Instruction |
|-------|-------------|
| 1     | LD          |
| 2     | stall (NOP) |
| 3     | stall (NOP) |
| 4     | DADDI       |
| 5     | stall (NOP) |
| 6     | stall (NOP) |
| 7     | SD          |
| 8     | DADDI (R2)  |
| 9     | DSUB        |
| 10    | stall (NOP) |
| 11    | stall (NOP) |
| 12    | BNEZ        |
| 13    | flush 1     |
| 14    | flush 2     |
| 15    | LD (next)   |

• Each iteration takes 15 cycles

• Total iterations: $\frac{396}{4} = 99$

Total cycles =  $99 \times 15 = 1485$  cycles
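The trip count and the total can be checked with a short Python sketch. It models only the loop's control arithmetic, assuming the instruction order implied by the pipeline table (R2 advanced by 4, then R4 = R3 - R2 tested by BNEZ), and it takes the 15 cycles per iteration stated above as given. The function name `count_iterations` and the base address 0 are illustrative choices.

```python
def count_iterations(base: int = 0, span: int = 396, step: int = 4) -> int:
    """Run the loop's control arithmetic and count executions of the body."""
    r2 = base
    r3 = r2 + span          # initial setup: R3 = R2 + 396
    iterations = 0
    while True:
        iterations += 1     # LD / DADDI / SD body executes once
        r2 += step          # DADDI R2, R2, 4
        r4 = r3 - r2        # DSUB R4, R3, R2
        if r4 == 0:         # BNEZ R4, loop falls through when R4 == 0
            break
    return iterations


CYCLES_PER_ITERATION_A = 15  # figure taken from the analysis above

iterations = count_iterations()
total_cycles_a = iterations * CYCLES_PER_ITERATION_A
print(iterations, total_cycles_a)  # 99 1485
```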

## b) With forwarding + predict not taken

Hazards resolved by bypassing:

- Forward from EX/MEM or MEM/WB stages.
- Only one stall is needed: DSUB → BNEZ (R4 not ready until WB)

### Branch predicted not taken:

• Misprediction penalty = 2 cycles (the instructions in IF and ID are flushed)

### Pipeline per iteration:

| Cycle | Instruction |
|-------|-------------|
| 1     | LD          |
| 2     | DADDI       |
| 3     | SD          |
| 4     | DADDI (R2)  |
| 5     | DSUB        |
| 6     | BNEZ        |
| 7     | flush 1     |
| 8     | flush 2     |
| 9     | LD (next)   |

• Each iteration takes 8 cycles

• Total: 99 × 8 = 792 cycles

## c) With delayed branch + forwarding

Branch delay slot: the instruction after the branch is always executed, whether or not the branch is taken.

Goal: fill the delay slot with a useful instruction that is safe to execute on both paths.

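• Best choice: move DADDI R2, R2, 4 after BNEZ. DSUB then reads R2 before the increment, so R3 must be set 4 lower (R3 = R2 + 392) to keep the loop at 99 iterations.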

• Updated sequence:

```
loop: LD    R1, 0(R2)
      DADDI R1, R1, 1
      SD    0(R2), R1
      DSUB  R4, R3, R2
      BNEZ  R4, loop
      DADDI R2, R2, 4   # delay slot
```

No stalls are needed thanks to forwarding, and the delay slot hides the branch penalty.
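The effect of moving the increment into the delay slot on the trip count can be checked with a short Python sketch. It models only the control arithmetic of the reordered loop; the helper name `count_iterations_delayed` and the base address 0 are hypothetical. Under these assumptions, the exit test sees R2 before the increment, so the loop bound would have to be R2 + 392 for the trip count to stay at 99.

```python
def count_iterations_delayed(span: int, base: int = 0, step: int = 4) -> int:
    """Count body executions for the reordered loop (increment in delay slot)."""
    r2 = base
    r3 = r2 + span
    iterations = 0
    while True:
        iterations += 1       # LD / DADDI / SD body executes once
        r4 = r3 - r2          # DSUB R4, R3, R2 (reads R2 before the increment)
        taken = r4 != 0       # BNEZ R4, loop
        r2 += step            # DADDI R2, R2, 4 in the delay slot, always executed
        if not taken:
            break
    return iterations


print(count_iterations_delayed(396))  # 100
print(count_iterations_delayed(392))  # 99
```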

### Pipeline per iteration:

| Cycle | Instruction             |
|-------|-------------------------|
| 1     | LD                      |
| 2     | DADDI                   |
| 3     | SD                      |
| 4     | DSUB                    |
| 5     | BNEZ                    |
| 6     | DADDI (R2) (delay slot) |
| 7     | LD (next)               |

• Each iteration takes 6 cycles

• Total: 99 × 6 = 594 cycles

## Summary of Execution Times:

| Case | Description                        | Cycles/Iter | Total Cycles |
|------|------------------------------------|-------------|--------------|
| a    | No forwarding, stall & flush       | 15          | 1485         |
| b    | With forwarding, predict not taken | 8           | 792          |
| c    | Forwarding + delayed branch        | 6           | 594          |
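
The totals in the table, and the relative speedup of each scheme over case a, follow directly from the per-iteration figures. The Python sketch below recomputes them; the speedup ratios are derived here for illustration and do not appear in the analysis above, and the variable names are arbitrary.

```python
ITERATIONS = 396 // 4  # 99 loop iterations

# Cycles per iteration, taken from the summary table.
cycles_per_iteration = {
    "a": 15,  # no forwarding, stall and flush
    "b": 8,   # forwarding, predict not taken
    "c": 6,   # forwarding plus delayed branch
}

totals = {case: ITERATIONS * cpi for case, cpi in cycles_per_iteration.items()}
speedups = {case: totals["a"] / total for case, total in totals.items()}

for case in totals:
    print(f"{case}: {totals[case]} cycles, speedup over a = {speedups[case]:.3f}")
# a: 1485 cycles, speedup over a = 1.000
# b: 792 cycles, speedup over a = 1.875
# c: 594 cycles, speedup over a = 2.500
```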